64 research outputs found

    One-shot Learning for iEEG Seizure Detection Using End-to-end Binary Operations: Local Binary Patterns with Hyperdimensional Computing

    This paper presents an efficient binarized algorithm for both learning and classification of human epileptic seizures from intracranial electroencephalography (iEEG). The algorithm combines local binary patterns with brain-inspired hyperdimensional computing to enable end-to-end learning and inference with binary operations. The algorithm first transforms iEEG time series from each electrode into local binary pattern codes. Then atomic high-dimensional binary vectors are used to construct composite representations of seizures across all electrodes. For the majority of our patients (10 out of 16), the algorithm quickly learns from one or two seizures (i.e., one-/few-shot learning) and perfectly generalizes on 27 further seizures. For other patients, the algorithm requires three to six seizures for learning. Overall, our algorithm surpasses the state-of-the-art methods for detecting 65 novel seizures with higher specificity and sensitivity, and lower memory footprint. Comment: Published as a conference paper at the IEEE BioCAS 201
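
The pipeline summarized in this abstract can be sketched roughly as follows. This is a toy illustration, not the paper's implementation: the 6-bit code length, the 1,000-bit dimensionality, the bind-then-bundle encoding, the class labels, and every function name are assumptions made for the example.

```python
import random

def lbp_codes(signal, bits=6):
    """Local binary pattern codes: one bit per rise (1) or fall (0) between samples."""
    signs = [1 if later > earlier else 0 for earlier, later in zip(signal, signal[1:])]
    codes = []
    for start in range(len(signs) - bits + 1):
        code = 0
        for bit in signs[start:start + bits]:
            code = (code << 1) | bit
        codes.append(code)
    return codes

def random_hv(dim, rng):
    """Atomic binary hypervector with independent random bits."""
    return [rng.randint(0, 1) for _ in range(dim)]

def bind(a, b):
    """Binding as element-wise XOR."""
    return [x ^ y for x, y in zip(a, b)]

def bundle(hvs):
    """Bundling as a bit-wise majority vote (ties resolve to 0)."""
    count = len(hvs)
    return [1 if 2 * sum(column) > count else 0 for column in zip(*hvs)]

def hamming(a, b):
    """Number of positions where two hypervectors differ."""
    return sum(x != y for x, y in zip(a, b))

def encode_window(window, code_hvs, electrode_hvs, bits=6):
    """Assumed encoding: bind each LBP code to its electrode, then bundle everything."""
    bound = []
    for electrode, signal in enumerate(window):
        for code in lbp_codes(signal, bits):
            bound.append(bind(electrode_hvs[electrode], code_hvs[code]))
    return bundle(bound)

def classify(query, prototypes):
    """Return the label whose prototype is nearest in Hamming distance."""
    return min(prototypes, key=lambda label: hamming(query, prototypes[label]))
```

Under these assumptions, learning a class amounts to storing one bundled prototype per label, and inference is a nearest-prototype lookup, so every step uses only binary operations and counting.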

    ExaMon-X: a Predictive Maintenance Framework for Automatic Monitoring in Industrial IoT Systems

    In recent years, the Industrial Internet of Things (IIoT) has led to significant steps forward in many industries, thanks to the exploitation of several technologies, ranging from Big Data processing to Artificial Intelligence (AI). Among the various IIoT scenarios, large-scale data centers can reap significant benefits from adopting Big Data analytics and AI-boosted approaches since these technologies can allow effective predictive maintenance. However, most of the currently available off-the-shelf solutions are not ideally suited to the HPC context, e.g., they do not sufficiently take into account the very heterogeneous data sources and the privacy issues which hinder the adoption of the cloud solution, or they do not fully exploit the computing capabilities available in loco in a supercomputing facility. In this paper, we tackle this issue, and we propose a holistic and vertical IIoT framework for predictive maintenance in supercomputers. The framework is based on a lightweight Big Data monitoring infrastructure, specialized databases suited for heterogeneous data, and a set of high-level AI-based functionalities tailored to HPC actors’ specific needs. We present the deployment and assess the usage of this framework in several in-production HPC systems.

    Optimizing AI at the Edge: from network topology design to MCU deployment

    The first topic analyzed in the thesis will be Neural Architecture Search (NAS). I will focus on two different tools that I developed, one to optimize the architecture of Temporal Convolutional Networks (TCNs), a convolutional model for time-series processing that has recently emerged, and one to optimize the data precision of tensors inside CNNs. The first NAS proposed explicitly targets the optimization of the most peculiar architectural parameters of TCNs, namely dilation, receptive field, and the number of features in each layer. Note that this is the first NAS that explicitly targets these networks. The second NAS proposed instead focuses on finding the most efficient data format for a target CNN, with the granularity of the layer filter. Note that applying these two NASes in sequence allows an "application designer" to minimize the structure of the neural network employed, minimizing the number of operations or the memory usage of the network. After that, the second topic described is the optimization of neural network deployment on edge devices. Importantly, exploiting edge platforms' scarce resources is critical for efficient NN execution on MCUs. To do so, I will introduce DORY (Deployment Oriented to memoRY) -- an automatic tool to deploy CNNs on low-cost MCUs. DORY, in different steps, can manage different levels of memory inside the MCU automatically, offload the computation workload (i.e., the different layers of a neural network) to dedicated hardware accelerators, and automatically generate ANSI C code that orchestrates off- and on-chip transfers with the computation phases. On top of this, I will introduce two optimized computation libraries that DORY can exploit to deploy TCNs and Transformers on edge efficiently. I conclude the thesis with two different applications on bio-signal analysis, i.e., heart rate tracking and sEMG-based gesture recognition.

    DORY: Automatic End-to-End Deployment of Real-World DNNs on Low-Cost IoT MCUs

    The deployment of Deep Neural Networks (DNNs) on end-nodes at the extreme edge of the Internet-of-Things is a critical enabler to support pervasive Deep Learning-enhanced applications. Low-Cost MCU-based end-nodes have limited on-chip memory and often replace caches with scratchpads, to reduce area overheads and increase energy efficiency -- requiring explicit DMA-based memory transfers between different levels of the memory hierarchy. Mapping modern DNNs on these systems requires aggressive topology-dependent tiling and double-buffering. In this work, we propose DORY (Deployment Oriented to memoRY) - an automatic tool to deploy DNNs on low-cost MCUs with typically less than 1MB of on-chip SRAM memory. DORY abstracts tiling as a Constraint Programming (CP) problem: it maximizes L1 memory utilization under the topological constraints imposed by each DNN layer. Then, it generates ANSI C code to orchestrate off- and on-chip transfers and computation phases. Furthermore, to maximize speed, DORY augments the CP formulation with heuristics promoting performance-effective tile sizes. As a case study for DORY, we target GreenWaves Technologies GAP8, one of the most advanced parallel ultra-low power MCU-class devices on the market. On this device, DORY achieves up to 2.5x better MAC/cycle than the GreenWaves proprietary software solution and 18.1x better than the state-of-the-art result on an STM32-F746 MCU on single layers. Using our tool, GAP-8 can perform end-to-end inference of a 1.0-MobileNet-128 network consuming just 63 pJ/MAC on average @ 4.3 fps - 15.4x better than an STM32-F746. We release all our developments - the DORY framework, the optimized backend kernels, and the related heuristics - as open-source software. Comment: 14 pages, 12 figures, 4 tables, 2 listings. Accepted for publication in IEEE Transactions on Computers (https://ieeexplore.ieee.org/document/9381618
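
The idea of choosing tile sizes that maximize L1 utilization under a memory budget can be illustrated with a small sketch. It is not DORY's formulation: an exhaustive search stands in for a Constraint Programming solver, and the 8-bit data size, the convolution footprint model, tiling over output channels and spatial dimensions only, and all names are assumptions made for the example.

```python
from itertools import product

def tile_bytes(c_in, c_out, h_out, w_out, k=3, stride=1):
    """Bytes needed by one tile (input, weights, output), assuming 8-bit data."""
    h_in = (h_out - 1) * stride + k  # input rows needed to produce h_out output rows
    w_in = (w_out - 1) * stride + k
    input_bytes = c_in * h_in * w_in
    weight_bytes = c_out * c_in * k * k
    output_bytes = c_out * h_out * w_out
    return input_bytes + weight_bytes + output_bytes

def best_tile(c_in, c_out, h, w, l1_budget, k=3, stride=1):
    """Brute-force search for the tile that fills L1 best while still fitting."""
    best, best_use = None, -1
    for t_c, t_h, t_w in product(range(1, c_out + 1), range(1, h + 1), range(1, w + 1)):
        # Double buffering: two copies of every tile buffer live in L1 at once.
        use = 2 * tile_bytes(c_in, t_c, t_h, t_w, k, stride)
        if use <= l1_budget and use > best_use:
            best, best_use = (t_c, t_h, t_w), use
    return best, best_use

if __name__ == "__main__":
    # Hypothetical layer: 4 input channels, 8 output channels, 8x8 output, 2 kB of L1.
    print(best_tile(4, 8, 8, 8, 2048))
```

A real solver would add the layer-specific constraints and performance heuristics the abstract mentions; the sketch keeps only the memory-fit constraint and the utilization objective.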

    Multi-Complexity-Loss DNAS for Energy-Efficient and Memory-Constrained Deep Neural Networks

    Neural Architecture Search (NAS) is increasingly popular to automatically explore the accuracy versus computational complexity trade-off of Deep Learning (DL) architectures. When targeting tiny edge devices, the main challenge for DL deployment is matching the tight memory constraints, hence most NAS algorithms consider model size as the complexity metric. Other methods reduce the energy or latency of DL models by trading off accuracy and number of inference operations. Energy and memory are rarely considered simultaneously, in particular by low-search-cost Differentiable NAS (DNAS) solutions. We overcome this limitation by proposing the first DNAS that directly addresses the most realistic scenario from a designer's perspective: the co-optimization of accuracy and energy (or latency) under a memory constraint, determined by the target HW. We do so by combining two complexity-dependent loss functions during training, with independent strength. Testing on three edge-relevant tasks from the MLPerf Tiny benchmark suite, we obtain rich Pareto sets of architectures in the energy vs. accuracy space, with memory footprint constraints spanning from 75% to 6.25% of the baseline networks. When deployed on a commercial edge device, the STM NUCLEO-H743ZI2, our networks span a range of 2.18x in energy consumption and 4.04% in accuracy for the same memory constraint, and reduce energy by up to 2.2x with negligible accuracy drop with respect to the baseline. Comment: Accepted for publication at the ISLPED 2022 ACM/IEEE International Symposium on Low Power Electronics and Desig
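
One way to picture "two complexity-dependent loss functions with independent strength" is the scalar sketch below. The abstract does not give the actual loss definitions, so the linear energy term, the hinge-style memory penalty, and all parameter names are assumptions; in a real DNAS both costs would be differentiable functions of the architecture parameters.

```python
def combined_loss(task_loss, energy_cost, model_bytes, memory_budget_bytes,
                  energy_strength, memory_strength):
    """Task loss plus two independently weighted complexity terms (illustrative only)."""
    # Energy (or latency) is always penalized, scaled by its own strength.
    energy_term = energy_strength * energy_cost
    # Assumed hinge: memory is penalized only while the model exceeds the budget.
    memory_term = memory_strength * max(0.0, model_bytes - memory_budget_bytes)
    return task_loss + energy_term + memory_term
```

Keeping the two strengths separate lets a designer sweep the energy weight to trace an energy versus accuracy Pareto front while the memory term only pushes the search back inside the budget.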

    ECG-TCN: Wearable Cardiac Arrhythmia Detection with a Temporal Convolutional Network

    Personalized ubiquitous healthcare solutions require energy-efficient wearable platforms that provide an accurate classification of bio-signals while consuming low average power for long-term battery-operated use. Single lead electrocardiogram (ECG) signals provide the ability to detect, classify, and even predict cardiac arrhythmia. In this paper, we propose a novel temporal convolutional network (TCN) that achieves high accuracy while still being feasible for wearable platform use. Experimental results on the ECG5000 dataset show that the TCN achieves an accuracy (94.2%) similar to that of the state-of-the-art (SoA) network while achieving an improvement of 16.5% in the balanced accuracy score. This accurate classification is done with 27 times fewer parameters and 37 times fewer multiply-accumulate operations. We test our implementation on two publicly available platforms, the STM32L475, which is based on ARM Cortex M4F, and the GreenWaves Technologies GAP8 on the GAPuino board, based on 1+8 RISC-V CV32E40P cores. Measurements show that the GAP8 implementation respects the real-time constraints while consuming 0.10 mJ per inference. With 9.91 GMAC/s/W, it is 23.0 times more energy-efficient and 46.85 times faster than an implementation on the ARM Cortex M4F (0.43 GMAC/s/W). Overall, we obtain 8.1% higher accuracy while consuming 19.6 times less energy and being 35.1 times faster compared to a previous SoA embedded implementation. Comment: 4 pages, 1 figure, 2 table

    Q-PPG: Energy-Efficient PPG-Based Heart Rate Monitoring on Wearable Devices

    Heart Rate (HR) monitoring is increasingly performed in wrist-worn devices using low-cost photoplethysmography (PPG) sensors. However, Motion Artifacts (MAs) caused by movements of the subject's arm affect the performance of PPG-based HR tracking. This is typically addressed by coupling the PPG signal with acceleration measurements from an inertial sensor. Unfortunately, most standard approaches of this kind rely on hand-tuned parameters, which impair their generalization capabilities and their applicability to real data in the field. In contrast, methods based on deep learning, despite their better generalization, are considered to be too complex to deploy on wearable devices. In this work, we tackle these limitations, proposing a design space exploration methodology to automatically generate a rich family of deep Temporal Convolutional Networks (TCNs) for HR monitoring, all derived from a single "seed" model. Our flow involves a cascade of two Neural Architecture Search (NAS) tools and a hardware-friendly quantizer, whose combination yields both highly accurate and extremely lightweight models. When tested on the PPG-Dalia dataset, our most accurate model sets a new state-of-the-art in Mean Absolute Error. Furthermore, we deploy our TCNs on an embedded platform featuring an STM32WB55 microcontroller, demonstrating their suitability for real-time execution. Our most accurate quantized network achieves 4.41 Beats Per Minute (BPM) of Mean Absolute Error (MAE), with an energy consumption of 47.65 mJ and a memory footprint of 412 kB. At the same time, the smallest network that obtains an MAE < 8 BPM, among those generated by our flow, has a memory footprint of 1.9 kB and consumes just 1.79 mJ per inference.

    Classification of microadenomas in patients with primary aldosteronism by steroid profiling

    In primary aldosteronism (PA) the differentiation of unilateral aldosterone-producing adenomas (APA) from bilateral adrenal hyperplasia (BAH) is usually performed by adrenal venous sampling (AVS) and/or computed tomography (CT). CT alone often lacks the sensitivity to identify micro-APAs. Our objectives were to establish if steroid profiling could be useful for the identification of patients with micro-APAs and for the development of an online tool to differentiate micro-APAs, macro-APAs and BAH. The study included patients with PA (n = 197) from Munich (n = 124) and Torino (n = 73) and comprised 33 patients with micro-APAs, 95 with macro-APAs, and 69 with BAH. Subtype differentiation was by AVS, and micro- and macro-APAs were selected according to pathology reports. Steroid concentrations in peripheral venous plasma were measured by liquid chromatography-tandem mass spectrometry. An online tool using a random forest model was built for the classification of micro-APAs, macro-APAs and BAH. Micro-APAs were classified with low specificity (33%) but macro-APAs and BAH were correctly classified with high specificity (93%). Improved classification of micro-APAs was achieved using a diagnostic algorithm integrating steroid profiling, CT scanning and AVS procedures limited to patients with discordant steroid and CT results. This would have increased the correct classification of micro-APAs to 68% and improved the overall classification to 92%. Such an approach could be useful to select patients with CT-undetectable micro-APAs in whom AVS should be considered mandatory.

    Identification of a serum and urine extracellular vesicle signature predicting renal outcome after kidney transplant

    Background: A long-standing effort is dedicated towards the identification of biomarkers allowing the prediction of graft outcome after kidney transplant. Extracellular vesicles (EVs) circulating in body fluids represent an attractive candidate, as their cargo mirrors the originating cell and its pathophysiological status. The aim of the study was to investigate EV surface antigens as potential predictors of renal outcome after kidney transplant. Methods: We characterized 37 surface antigens by flow cytometry, in serum and urine EVs from 58 patients who were evaluated before, and at 10-14 days, 3 months and 1 year after transplant, for a total of 426 analyzed samples. The outcome was defined according to estimated glomerular filtration rate (eGFR) at 1 year. Results: Endothelial cell and platelet markers (CD31, CD41b, CD42a and CD62P) in serum EVs were higher at baseline in patients with persistent kidney dysfunction at 1 year, and progressively decreased after kidney transplant. Conversely, mesenchymal progenitor cell markers (CD1c, CD105, CD133, SSEA-4) in urine EVs progressively increased after transplant in patients displaying renal recovery at follow-up. These markers correlated with eGFR, creatinine and proteinuria, were associated with patient outcome at univariate analysis, and were able to predict patient outcome at receiver operating characteristic curve analysis. A specific EV molecular signature obtained by supervised learning correctly classified patients according to 1-year renal outcome. Conclusions: An EV-based signature, reflecting the cardiovascular profile of the recipient, and the repairing/regenerative features of the graft, could be introduced as a non-invasive tool for a tailored management of follow-up of patients undergoing kidney transplant.